-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement]: observability integration without using agenta hosted apps #1567
[Enhancement]: observability integration without using agenta hosted apps #1567
Conversation
… rewrite agenta singleton init
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @aybruhm
I did not finish the review, I am a bit too tired of that :) But I thought I'd leave some of the comments I made.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aybruhm I thought a bit about the problem, unfortunately I have to leave early today (in a bit) so I am just going to write my main thoughts. I hope these make sense. I'd love to get your feedback on this:
There are three workflows for instrumentation in agenta:
- Using an application hosted in agenta, we take care of adding the code snippets
- Using an application in shell
- Using an application somewhere else (hosted somewhere else)
We have designed the SDK for instrumentation for the first case. To instrument stuff in the first case, we use the following:
ag.init
→ The only reason for this is to set the variables api_key, app_id,, and initialize the stuff for the config.llm_tracing()
→ Initializes the tracing object@ag.entrypoint
→ Callsllm_tracing()
to initialize the tracing singleton, then start the parent_span. In addition it does another million things for using the app: adding the argument to fastapi, the bash entrypoints…@ag.span
→ Saves the span in the current tracing singleton
The question I am asking are:
- How can we simplify instrumentation to the max. Remove anything that is not required.
ag.init()
- Not really required for the instrumentation per se. We can instead use the environment variables for initializing the tracing object in case the agenta singleton does not exist (or the input variables)
llm_tracing
We can keep this, as a way to use tracing with ag.init(), but for the pure observability use case, we can simply use the Tracing constructor and make sure that we use the environment variables if nothing is provided@ag.entrypoint
→ I think we should have a different decorator that only implements initializing the tracing object, and calling start_parent_span. Basically has only one responsability. (ag.trace
)@ag.span
Nothing changes there
So the way the final implementation would be:
# set the env vars
os.environ["AGENTA_API_KEY"] = xxx
os.environ["AGENTA_APP_ID"] = xxx
@ag.span(type="LLM")
async def llm_call(prompt):
chat_completion = await client.chat.completions.create(
model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
)
tracing.set_span_attribute(
"model_config", {"model": "gpt-3.5-turbo", "temperature": ag.config.temperature}
) # translate to {"model_config": {"model": "gpt-3.5-turbo", "temperature": 0.2}}
tokens_usage = chat_completion.usage.dict()
return {
"cost": ag.calculate_token_usage("gpt-3.5-turbo", tokens_usage),
"message": chat_completion.choices[0].message.content,
"usage": tokens_usage,
}
@ag.span
async def generate(country: str, gender: str):
"""
Generate a baby name based on the given country and gender.
Args:
country (str): The country to generate the name from.
gender (str): The gender of the baby.
"""
prompt = ag.config.prompt_template.format(country=country, gender=gender)
response = await llm_call(prompt=prompt)
return {
"message": response["message"],
"usage": response["usage"],
"cost": response["cost"],
}
In case someone has two spans in the same code using two different apps
# set the env vars
os.environ["AGENTA_API_KEY"] = xxx
@ag.span(type="LLM")
async def llm_call(prompt):
chat_completion = await client.chat.completions.create(
model="gpt-3.5-turbo", messages=[{"role": "user", "content": prompt}]
)
tracing.set_span_attribute(
"model_config", {"model": "gpt-3.5-turbo", "temperature": ag.config.temperature}
) # translate to {"model_config": {"model": "gpt-3.5-turbo", "temperature": 0.2}}
tokens_usage = chat_completion.usage.dict()
return {
"cost": ag.calculate_token_usage("gpt-3.5-turbo", tokens_usage),
"message": chat_completion.choices[0].message.content,
"usage": tokens_usage,
}
@ag.trace(app_id="")
async def generate(country: str, gender: str):
"""
Generate a baby name based on the given country and gender.
Args:
country (str): The country to generate the name from.
gender (str): The gender of the baby.
"""
prompt = ag.config.prompt_template.format(country=country, gender=gender)
response = await llm_call(prompt=prompt)
return {
"message": response["message"],
"usage": response["usage"],
"cost": response["cost"],
}
@ag.trace(app_id="")
async def somepotherlogic(country: str, gender: str):
prompt = ag.config.prompt_template.format(country=country, gender=gender)
response = await llm_call(prompt=prompt)
return {
"message": response["message"],
"usage": response["usage"],
"cost": response["cost"],
}
The issue in the second example is that we have currently one singleton which we are using. So maybe this use case is not feasible now?
Love to hear your thoughts
Thanks for the review, @mmabrouk. I thought about this last night, and I created a POC to test how the implementation would work and it did. Observability would still work if we:
Regarding having @ag.trace # ag.entrypoint will be added after ag.trace in the case where the user wants to deploy their LLM app to agenta
async def generate(country: str, gender: str):
prompt = ag.config.prompt_template.format(country=country, gender=gender)
response = await llm_call(prompt=prompt)
return {
"message": response["message"],
"usage": response["usage"],
"cost": response["cost"],
} |
The use case is. We just need to revise our implementation of Tracing to support multiple singletons if we want to have different spans for different applications. The behaviour of the Tracing would not change, only thing we would need to do is modify the Tracing constructor to support multiple singletons, and also introduce a mechanism that would switch between different singletons based on the provided app_id. |
…genta_decorator & span files and integrate their contents into their individual decorator file
…int decorator that enables tracing
…ut-using-agenta-hosted-apps
…enta-hosted-apps' into fix-obs-pr-2
… into fix-obs-pr-2
Improvements to observability PR
Description
This PR proposes integrating observability into LLM applications with Agenta, even if the applications are not hosted on Agenta.
Related Issue
Closes cloud_#338
Relative commons_#43
Additional Information
A pre-alpha version of the SDK has been deployed to pypi for testing. You can check the obs-app on cloud.beta and review the observability traces created with the example in this PR.